adversarial imitation
Diffusion Imitation from Observation
Learning from Observation (LfO) aims to imitate experts from state-only demonstrations, without requiring action labels. Existing adversarial imitation learning approaches learn a generator (the agent policy) that produces state transitions a discriminator cannot distinguish from expert transitions. Despite the simplicity of this formulation, these methods are often sensitive to hyperparameters and brittle to train. Motivated by the recent success of diffusion models in generative modeling, we propose integrating a diffusion model into the adversarial learning-from-observation framework. Specifically, we employ a diffusion model that captures expert and agent transitions by generating the next state given the current state. We then reformulate the training objective so that the diffusion model acts as a binary classifier, and use it to provide "realness" rewards for policy learning. Our proposed framework, Diffusion Imitation from Observation (DIFO), demonstrates superior performance across various continuous control domains, including navigation, locomotion, manipulation, and games.
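A minimal sketch of the idea described in the abstract: a conditional denoiser models the next state given the current state, its denoising error is repurposed as a classifier logit, and the log-sigmoid of that logit serves as a "realness" reward. All names, the toy noise schedule, and the network shapes below are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalDenoiser(nn.Module):
    """Predicts the noise added to the next state s', conditioned on s and t."""
    def __init__(self, state_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, s, noisy_next, t):
        return self.net(torch.cat([s, noisy_next, t], dim=-1))

def diffusion_logit(model, s, s_next, n_steps=10):
    """Smaller denoising error -> higher 'realness' logit."""
    b = s.shape[0]
    t = torch.randint(1, n_steps + 1, (b, 1)).float() / n_steps
    noise = torch.randn_like(s_next)
    alpha = 1.0 - t  # toy linear schedule, for illustration only
    noisy = alpha.sqrt() * s_next + (1 - alpha).sqrt() * noise
    err = F.mse_loss(model(s, noisy, t), noise, reduction="none").mean(-1)
    return -err  # negative denoising loss as the classifier logit

def discriminator_loss(model, expert_batch, agent_batch):
    """Binary-classifier reformulation: expert transitions are labeled 'real'."""
    logit_e = diffusion_logit(model, *expert_batch)
    logit_a = diffusion_logit(model, *agent_batch)
    return (F.binary_cross_entropy_with_logits(logit_e, torch.ones_like(logit_e))
            + F.binary_cross_entropy_with_logits(logit_a, torch.zeros_like(logit_a)))

def realness_reward(model, s, s_next):
    """Reward for policy learning: log-sigmoid of the diffusion logit."""
    with torch.no_grad():
        return F.logsigmoid(diffusion_logit(model, s, s_next))
```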
Error Bounds of Imitating Policies and Environments
Imitation learning trains a policy by mimicking expert demonstrations. Many imitation methods have been proposed and empirically evaluated, but their theoretical understanding requires further study. In this paper, we first analyze the value gap between the expert policy and policies imitated by two methods: behavioral cloning and generative adversarial imitation. The results show that generative adversarial imitation reduces the compounding errors of behavioral cloning and thus enjoys better sample complexity. Noting that the environment transition model can be treated as a dual agent, imitation learning can also be used to learn the environment model. Based on the bounds for imitating policies, we therefore analyze the performance of imitating environments. The results show that environment models can be imitated more effectively by generative adversarial imitation than by behavioral cloning, suggesting a novel application of adversarial imitation to model-based reinforcement learning. We hope these results inspire future advances in imitation learning and model-based reinforcement learning.
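The contrast can be summarized schematically. The bounds below are a hedged paraphrase of the paper's qualitative message, with constants and technical conditions omitted: for per-state imitation error $\epsilon$ and effective horizon $H$, behavioral cloning's compounding errors yield a quadratic horizon dependence, while adversarial imitation keeps it linear.

```latex
% Schematic value-gap bounds (paraphrased; exact constants and
% assumptions are in the paper):
V(\pi_E) - V(\pi_{\mathrm{BC}})   \;\lesssim\; H^{2}\,\epsilon
\qquad\text{vs.}\qquad
V(\pi_E) - V(\pi_{\mathrm{GAIL}}) \;\lesssim\; H\,\epsilon .
```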
Near-Optimal Second-Order Guarantees for Model-Based Adversarial Imitation Learning
Li, Shangzhe, Zhou, Dongruo, Zhang, Weitong
We study online adversarial imitation learning (AIL), where an agent learns from offline expert demonstrations and interacts with the environment online without access to rewards. Despite strong empirical results, the benefits of online interaction and the impact of stochasticity remain poorly understood. We address these gaps by introducing a model-based AIL algorithm (MB-AIL) and establishing horizon-free, second-order sample-complexity guarantees under general function approximation for both expert data and reward-free interactions. These second-order bounds are instance-dependent: they scale with the variance of returns under the relevant policies and therefore tighten as the system approaches determinism. Together with second-order, information-theoretic lower bounds on a newly constructed hard-instance family, we show that MB-AIL attains minimax-optimal sample complexity for online interaction (up to logarithmic factors) with limited expert demonstrations, and matches the lower bound for expert demonstrations in its dependence on the horizon $H$, the precision $\varepsilon$, and the policy variance $\sigma^2$. Experiments validate our theoretical findings and demonstrate that a practical implementation of MB-AIL matches or surpasses the sample efficiency of existing methods.
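For intuition only, a second-order (variance-dependent) guarantee typically takes the following shape; this is an illustrative form, not the paper's exact statement, whose precise dependence on $H$, $\varepsilon$, and $\sigma^2$ is given in its theorems.

```latex
% Illustrative shape of a second-order sample-complexity bound:
n_{\text{online}}
  \;=\; \widetilde{O}\!\left(\frac{\sigma^{2}}{\varepsilon^{2}}
        + \frac{1}{\varepsilon}\right),
\qquad
\sigma^{2} \to 0 \;\Rightarrow\; n_{\text{online}} = \widetilde{O}(1/\varepsilon),
```

so that as the system approaches determinism, the leading variance term vanishes and the bound tightens, matching the instance-dependent behavior described above.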
On Generalization and Distributional Update for Mimicking Observations with Adequate Exploration
Zhou, Yirui, Liu, Xiaowei, Zhang, Xiaofeng, Zhang, Yangchun
Imitation learning (IL) (Pomerleau, 1991; Ng et al., 2000; Syed and Schapire, 2007; Ho and Ermon, 2016), a realm distinct from standard reinforcement learning (RL) (Puterman, 2014; Sutton and Barto, 2018), does not depend on rewards provided by the environment. This characteristic makes IL particularly suited to numerous real-world applications (Bhattacharyya et al., 2018; Shi et al., 2019; Jabri, 2021). The general IL paradigm leverages expert demonstrations containing both states and actions to mimic a high-performing policy (Abbeel and Ng, 2004; Ho and Ermon, 2016; Kostrikov et al., 2020). Based on the policy training strategy, IL divides into two main schemes: on-policy and off-policy training. The on-policy scheme (Ho and Ermon, 2016; Chen et al., 2020) is noted for its stability but requires a significant volume of samples.
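To make the on-policy paradigm concrete, here is a minimal sketch of one round of adversarial IL in the spirit of GAIL (Ho and Ermon, 2016). Every interface (`policy.sample`, `env.step`, the optimizers) is an illustrative placeholder rather than an API from any particular library, and GAIL proper uses TRPO/PPO rather than the plain REINFORCE surrogate shown here.

```python
import torch
import torch.nn.functional as F

def gail_round(policy, discriminator, expert_sa, env, pi_opt, d_opt, horizon=1000):
    """One round of on-policy adversarial imitation (GAIL-style sketch)."""
    # 1. Roll out the current policy to collect fresh on-policy samples.
    agent_sa, logps = [], []
    s = env.reset()                          # assumed to return a state tensor
    for _ in range(horizon):
        a, logp = policy.sample(s)           # assumed policy interface
        agent_sa.append(torch.cat([s, a], -1))
        logps.append(logp)
        s, _, done = env.step(a)             # assumed (state, reward, done) tuple
        if done:
            s = env.reset()
    agent_sa = torch.stack(agent_sa)

    # 2. Discriminator step: classify expert vs. agent state-action pairs.
    d_e, d_a = discriminator(expert_sa), discriminator(agent_sa)
    d_loss = (F.binary_cross_entropy_with_logits(d_e, torch.ones_like(d_e))
              + F.binary_cross_entropy_with_logits(d_a, torch.zeros_like(d_a)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 3. Policy step: maximize the discriminator-derived reward -log(1 - D).
    with torch.no_grad():
        rewards = -F.logsigmoid(-discriminator(agent_sa))
    pi_loss = -(torch.stack(logps) * rewards).mean()
    pi_opt.zero_grad(); pi_loss.backward(); pi_opt.step()
```

The sample cost mentioned above is visible in step 1: every policy update requires a fresh on-policy rollout, which off-policy schemes avoid by reusing past samples.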
Versatile Skill Control via Self-supervised Adversarial Imitation of Unlabeled Mixed Motions
Li, Chenhao, Blaes, Sebastian, Kolev, Pavel, Vlastelica, Marin, Frey, Jonas, Martius, Georg
Learning diverse skills is one of the main challenges in robotics. To this end, imitation learning approaches have achieved impressive results. However, these methods require explicitly labeled datasets or assume consistent skill execution to enable learning and active control of individual behaviors, which limits their applicability. In this work, we propose a cooperative adversarial method for obtaining a single, versatile policy with a controllable skill set from unlabeled datasets containing diverse state transition patterns, by maximizing their discriminability. Moreover, we show that by utilizing unsupervised skill discovery within the generative adversarial imitation learning framework, novel and useful skills emerge alongside successful task fulfillment. Finally, the obtained versatile policies are tested on an agile quadruped robot, Solo 8, and faithfully replicate the diverse skills encoded in the demonstrations.
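A minimal sketch of how a skill-conditioned reward of this flavor could be assembled, combining an imitation term with a skill-discriminability term in the style of unsupervised skill discovery. The `disc` and `skill_pred` networks are hypothetical placeholders; the paper's actual objectives and architecture differ in detail.

```python
import torch
import torch.nn.functional as F

def skill_conditioned_reward(disc, skill_pred, s, s_next, z):
    """Reward sketch for skill-conditioned adversarial imitation.

    disc(s, s_next)       -> realness logit against the unlabeled mixed dataset
    skill_pred(s, s_next) -> logits over discrete skills (illustrative q(z|s,s'))
    z                     -> long tensor of skill indices for this batch
    """
    with torch.no_grad():
        imitation = F.logsigmoid(disc(s, s_next))              # "looks like the data"
        skill_logp = F.log_softmax(skill_pred(s, s_next), -1)  # q(z | s, s')
        discriminability = skill_logp.gather(-1, z.unsqueeze(-1)).squeeze(-1)
    # Transitions are rewarded for resembling the demonstrations *and* for
    # being attributable to the commanded skill, i.e. maximally discriminable.
    return imitation + discriminability
```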
Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving
Bronstein, Eli, Palatucci, Mark, Notz, Dominik, White, Brandyn, Kuefler, Alex, Lu, Yiren, Paul, Supratik, Nikdel, Payam, Mougin, Paul, Chen, Hongge, Fu, Justin, Abrams, Austin, Shah, Punit, Racah, Evan, Frenkel, Benjamin, Whiteson, Shimon, Anguelov, Dragomir
We demonstrate the first large-scale application of model-based generative adversarial imitation learning (MGAIL) to the task of dense urban self-driving. We augment standard MGAIL with a hierarchical model to enable generalization to arbitrary goal routes, and measure performance using a closed-loop evaluation framework with simulated interactive agents. We train policies from expert trajectories collected from real vehicles driving over 100,000 miles in San Francisco, and demonstrate a steerable policy that navigates robustly even in a zero-shot setting, generalizing to synthetic scenarios with novel goals that never occurred in real-world driving. We also demonstrate the importance of mixing closed-loop MGAIL losses with open-loop behavior cloning losses, and show that our best policy approaches the performance of the expert. We evaluate our imitative model in both average and challenging scenarios, and show how it can serve as a useful prior for planning successful trajectories.
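The loss mixing described in the abstract could look like the following sketch; the `policy.log_prob` interface, the weighting scheme, and the `lam` value are illustrative assumptions, not the paper's actual implementation.

```python
def bc_loss(policy, expert_states, expert_actions):
    """Open-loop behavior cloning: negative log-likelihood of expert actions."""
    return -policy.log_prob(expert_states, expert_actions).mean()  # assumed API

def mixed_imitation_loss(mgail_closed_loop_loss, policy,
                         expert_states, expert_actions, lam=0.5):
    """Weighted mix of the closed-loop MGAIL objective and the open-loop
    BC objective; `lam` is a hyperparameter chosen arbitrarily here."""
    return (lam * mgail_closed_loop_loss
            + (1.0 - lam) * bc_loss(policy, expert_states, expert_actions))
```

The design intuition: the closed-loop adversarial term penalizes distribution shift accumulated over rollouts, while the open-loop BC term anchors the policy to expert actions and stabilizes training.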